Enrichment analyses of meta-analyses: evidence mapping, bibliometrics, and alternative impact metrics

Visualizing study characteristics, hidden risk of bias, societal influence, and research translation

Contributors

Yefeng Yang, Malgorzata Lagisz, Shinichi Nakagawa

Update

Last update Dec. 2023

Preface

The R script provided with this html reproduces all the figures in the main text. For the explanation of each figure, see the main text. Version information of each package used in this html is listed at the end. The relevant GitHub repository can be found at https://github.com/Yefeng0920/MA_Map_Bib.

Citation

If our paper and R script have helped you, please cite the following paper:

Yefeng Yang, Malgorzata Lagisz, Shinichi Nakagawa. Enriching meta-analyses through scoping review, bibliometrics, and alternative impact metrics: Visualizing study characteristics, hidden risk of bias, societal influence, and research translation. arXiv, 2023.

Contact

If you have any questions, mistakes, or bug to report, please contact corresponding authors:

  • Dr. Yefeng Yang

Evolution & Ecology Research Centre, EERC School of Biological, Earth and Environmental Sciences, BEES The University of New South Wales, Sydney, Australia

Email:

  • Professor Shinichi Nakagawa, PhD, FRSN

Evolution & Ecology Research Centre, EERC School of Biological, Earth and Environmental Sciences, BEES The University of New South Wales, Sydney, Australia

Email:

Set-up

Setting global options will apply those options to all of the following chunks of code in the tutorial.

Our illustrations use R statistical software and existing R packages, which you will first need to download and install.

If you do not have it on your machine, first install R (download). We recommend also downloading RStudio, a popular integrated development environment for coding with R, created by a company named posit (download).

After installing R, you must install several packages that contain necessary functions for performing the analyses in this tutorial. If the packages are archived in CRAN, use install.packages() to install them. To install packages that are not on CRAN and archived in Github repositories, execute devtools::install_github().

The package list is as follows:

tidyverse, here, DT, ggpubr, readxl, metafor, lme4, car, ggplot2, cowplot, visdat, naniar, viridis, ggthemr, pander, formatR, rotl, ape, ggstance, ggtree, flextable, bibliometrix, circlize, igraph.

Custom function

We provide helper functions necessary for our illustrations. You can load these custom functions from the .R file in Function sub-directory using the source function and the code provided below. Alternatively, you can copy and paste the code of these functions into your console and execute the code them to load functions into your computer’s working memory.

# custom function
source(here("Function", "custom.R"))

Enrichment 1: Evidence mapping

The first enrichment analysis is evidence mapping (sometimes termed scoping review, evidence review). It contains three types of plots: (1) grid-like plots (Figure I), (2) Sankey diagrams (Figure II), and (3) phylogenetic trees (Figure III).

Grid-like plots

We use Data 1 in the main text to illustrate how to make the grid-like plots (Figure I). Data 1 was originally collected by Hodkinson et al. (1), who conducted a network meta-analysis to assess the efficacy of different self-management interventions (multidisciplinary case management, regularly supported self-management, and minimally supported self-management) in enhancing the quality of life among asthma patients. Note that the following R code was adapted from (2).

Data 1

# load data
dat <- read_xlsx(here("Data", "Hodkinson_2020.xlsx"))

# preprocess
dat$vi <- dat$`Std Err2`^2
dat <- dat[, c("Intervention model", "Outcome", "Hedgesg", "vi", "Study name")]
names(dat) <- c("dimension1", "dimension2", "yi", "vi", "study_id")

# show data in a datatable
t1 <- dat %>%
    dfround(3) %>%
    DT::datatable()
t1

Visualization

Figure I shown in the main text can be made with:

# get estimate for each cell
est_dat <- dat %>% group_by(dimension1, dimension2) %>%
  group_modify(~ meta_aggregate(.x, rho = 0.5)) %>% ungroup()

# traditional map with the number of study
est_dat$dimension1 <- as.factor(est_dat$dimension1)

Box1_map1 <- ggplot(est_dat, aes(x = dimension1, y = dimension2, size = n_studies) ) +
  geom_point(alpha = 0.5, color = RColorBrewer::brewer.pal(n = 8, name = "Dark2")[1]) + 
  labs(x = "Dimension 1 (Intervention)", y = "Dimension 2 (Outcome)") +
  scale_size(range=c(5,10)) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + # 
  scale_y_discrete(labels = function(x) str_wrap(x, width = 10)) +
  theme_bw() +
  guides(size = "none") +
  theme(legend.position='top', 
        legend.justification='right',
        legend.direction='horizontal', 
        axis.text = element_text(color = "black"),
        axis.title = element_text(color = "black")) +
  geom_text(aes(label = as.character(n_studies)), size = 4, color = "gray10") +
  labs(caption = "The value in the cell is the number of studies") + 
   theme(plot.caption = element_text(size = 10, color = "gray10", face = "italic"),
         axis.text = element_text(size = 12),
         axis.title = element_text(size = 12, face = "bold")) 


# traditional map with the number of effect size
Box1_map2 <- ggplot(est_dat, aes(x = dimension1, y = dimension2, size = n_es)) +
  geom_point(alpha = 0.5, color = RColorBrewer::brewer.pal(n = 8, name = "Dark2")[3]) + 
  labs(x = "Dimension 1 (Intervention)", y = "Dimension 2 (Outcome)") +
  scale_size(range=c(5,10)) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
  scale_y_discrete(labels = function(x) str_wrap(x, width = 10)) +
  theme_bw() +
  guides(size = "none") +
  theme(legend.position='top', 
        legend.justification='right',
        legend.direction='horizontal',  
        axis.text = element_text(color = "black"),
        axis.title = element_text(color = "black")) +
  geom_text(aes(label = as.character(n_es)), size = 4, color = "gray10") +
  labs(caption = "The value in the cell is the number of effect sizes") +
   theme(plot.caption = element_text(size = 10, color = "gray10", face = "italic"),
         axis.text = element_text(size = 12),
         axis.title = element_text(face = "bold", size = 12))



# with effect size information
Box1_map3 <- ggplot(est_dat, aes(x = dimension1, y = dimension2, size = n_es, color = estimate)) +
  geom_point(alpha = 0.6) + 
  scale_color_gradient(
    low = "blue",
    high = "red", 
    limits = c(-1,1),
    guide = "colourbar") + 
  labs(x = "Dimension 1 (Intervention)", y = "Dimension 2 (Outcome)", color = "Meta-analytic mean effect size") +
  scale_size(range=c(5,10)) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
  scale_y_discrete(labels = function(x) str_wrap(x, width = 10)) +
  theme_bw() +
  guides(size = "none") +
  theme(legend.position='top', 
        legend.justification='right',
        legend.direction='horizontal', 
        axis.text = element_text(color = "black"),
        axis.title = element_text(color = "black")) +
  geom_text(aes(label = as.character(n_es)), size = 4, color = "gray10") +
  labs(caption = "The value in the cell is the number of effect sizes") + 
   theme(plot.caption = element_text(size = 10, color = "gray10", face = "italic"),
         axis.text = element_text(size = 12),
         axis.title = element_text(face = "bold", size = 12))


# 3-dimensional graph - we do not recommend to 3 dimensional graph because this is not that friendly to end-user
dat <- read_xlsx(here("Data","Hodkinson_2020.xlsx"))
dat$vi <- dat$`Std Err2`^2
dat <- dat[, c("Intervention model", "Outcome", "Age group", "Hedgesg", "vi", "Study name")]
names(dat) <- c("dimension1", "dimension2", "dimension3", "yi", "vi", "study_id")

# get estimate each cell 
est_dat <- dat %>% group_by(dimension1, dimension2, dimension3) %>%
  group_modify(~ meta_aggregate(.x, rho = 0.5)) %>% ungroup()

# with the third dimension
Box1_map4 <- ggplot(est_dat, aes(x = dimension1, y = dimension2, 
                      shape = dimension3,
                      size = n_es  
                      )) +
  geom_point(alpha = 0.6,  color = RColorBrewer::brewer.pal(n = 8, name = "Dark2")[5], aes(group = dimension3), position = position_dodge(width= 0.5)) +
 scale_color_gradient(
    low = "blue",
    high = "red", 
    limits = c(-1,1),
    guide = "colourbar") + 
  scale_shape_manual(values = c(15, 19, 17, 18, 10, 4, 3, 7, 8, 13)) +
  labs(x = "Dimension 1 (Intervention)", y = "Dimension 2 (Outcome)", shape = "Diamension 3 (Population)") +
  scale_size(range=c(5,10)) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
  scale_y_discrete(labels = function(x) str_wrap(x, width = 10)) +
  theme_bw() +
  guides(size = "none") +
  theme(legend.position='top', 
        legend.justification='right',
        legend.direction='horizontal', 
        axis.text = element_text(color = "black"),
        axis.title = element_text(color = "black")) +
  geom_text(aes(label = as.character(n_es), group = dimension3), size = 4, color = "gray10", position = position_dodge(width = .5)) +
  labs(caption = "The value in the cell is the number of effect sizes") + 
   theme(plot.caption = element_text(size = 10, color = "gray10", face = "italic"),
         axis.text = element_text(size = 12),
         axis.title = element_text(face = "bold", size = 12))


#Box1_map1 + Box1_map2 + Box1_map3 + Box1_map4 + patchwork::plot_layout(ncol = 2, nrow = 2) + patchwork::plot_annotation(tag_levels = "A")
#png(filename = "Box1_map.png", width = 10, height = 10, units = "in", type = "windows", res = 400)
plot_grid(Box1_map1, Box1_map2, Box1_map3, Box1_map4, labels = c('A','B','C','D'), label_size = 14, nrow = 2, ncol = 2)

Sankey diagrams

We used Data 2 in the main text to illustrate how to make the Sankey diagrams (Figure II). Data 2 was originally collected by Mertens et al. (3) ,who employed a multilevel meta-analytic model to synthesize evidence on the effectiveness of choice architecture interventions (often referred to as nudges) for behaviour change across various techniques, behavioural domains, and other study characteristics (e.g., populations and locations). The core R code came from https://github.com/davidsjoberg/ggsankey/blob/main/R/sankey.R.

Data 2

# load data
dat <- read.csv(here("Data", "Mertens_2021.csv"))
names(dat)[1] <- "publication_id"
# show data in a data
t2 <- dat %>%
    dfround(3) %>%
    DT::datatable()
t2

Visualization

Figure II can be made with:

# pre-process
dat <- dat %>%
    dplyr::select(type_experiment, intervention_technique, intervention_category,
        domain, population, location)

# png(filename = './Box1_sankey.png', width = 6, height = 5, units = 'in', type
# = 'windows', res = 400)
ggplot(dlong(dat, type_experiment, location, population, intervention_category, domain,
    intervention_technique), aes(x = x, next_x = next_x, node = node, next_node = next_node,
    fill = factor(node), label = node)) + sankey_p(flow.alpha = 0.8, node.color = "transparent") +
    sankey_p_label(size = 3, color = "white", fill = "gray10", alpha = 0.6) + ggsci::scale_color_tron() +
    theme_sankey(base_size = 10) + labs(x = NULL) + theme(legend.position = "none",
    plot.title = element_text(hjust = 0.5), axis.text.x = element_text(color = "black",
        size = 9)) + scale_x_discrete(labels = c("Moderator 1 \n (Experimental \n approach)",
    "Moderator 2 \n (Geographical \n location)", "Moderator 3 \n (Population \n characteristic)",
    "Moderator 4 \n (Architecture \n category)", "Moderator 5 \n (Behavioral \n domain)",
    "Moderator 6 \n (Intervention \n technique)"), position = "top")

Phylogenetic trees

We used Data 3 in the main text to illustrate how to make the phylogenetic trees (Figure III). Data 3 was originally collected by Sanders et al. (4) , who used a Bayesian meta-analytic model to synthesize evidence regarding the impacts of artificial light at night on physiological, phenological, life history, activity patterns, and population/community-based outcomes. This meta-analysis included more than 180 species. We only used the physiological outcomes.

Data 3

# load data
dat <- read_xlsx(here("Data", "Sanders_2021.xlsx"))
# show data in a data
t3 <- dat %>%
    dfround(3) %>%
    DT::datatable()
t3

Visualization

Figure III can be made with:

# some data wrangling
dat$Species <- gsub("\\.", "_", dat$Species)
dat <- dat[!is.na(dat$Species), ]  # 

# check species we have length(unique(dat$Species)) #184 unique species names,
# if no misspelling

# find Open Tree Taxonomy (OTT) IDs for each species
taxa <- tnrs_match_names(names = unique(dat$Species))

# for illustrative purpose, we just delete the species that are not matched
# find location
pos <- which(dat$Species %in% c("Phodopus_sungeroru", "Electra?__sp", "Molgula_sp",
    "Baetis_spp", "Agrostis_tenuis", "Anthoxanthum_odoratum", "Myotis_daubentonii",
    "_Myotis_mystacinus", "_Myotis_brandtii", "_Myotis_nattereri", "_Plecotus_auritus",
    "Pipistrellus_pipistrelles", "_Pipistrellus_nathusii", "Nyctalus_nyctalus", "_Nyctalus_leisleri",
    "_Eptesicus_serotinus_", "All_families", "Pipistrellus_hesperidus/Hypsugo_anchietaia",
    "Myotis?_sp", "MCF7", "7288cct", "A__cristatellus_", "A__evermanni", "A__gundlachi",
    "A__sagrei", "bat_community_", "N__leisleri", "Poecile_sp", "Coleoptera", "_Diptera",
    "_Lepidoptera", "_Erebidae", "_Chironomidae", "_Noctuidae_and_Psychodidae", "Nyctalus_and_Eptesicus_spp",
    "Anoectochilus_roxburghii", "Myotis_pilosatibialis", "Silene_latifolia"))
dat <- dat[-pos, ]

# match again
taxa <- tnrs_match_names(names = unique(dat$Species))  # still some are not matched. But let's resolve it latter

# check whether occur in the synthetic tree
ott_in_tree <- ott_id(taxa)[is_in_tree(ott_id(taxa))]
# length(ott_id(taxa)) - length(is.na(ott_in_tree)) # still 10 did not appear
# in synthetic tree

# check which 10
out_tree <- filter(taxa, !ott_id %in% ott_in_tree)
# out_tree$search_string

dat <- dat[!dat$Species %in% c("Leptocythere_pellucida", "Myotis_daubentonii,_Myotis_mystacinus,_Myotis_brandtii,_Myotis_nattereri,_Plecotus_auritus",
    "Pipistrellus_pipistrelles,_Pipistrellus_nathusii", "Nyctalus_nyctalus,_Nyctalus_leisleri,_Eptesicus_serotinus_",
    "Myotis_lucifugus", "Acyrthosiphont", "Carabidae", "Staphylinidae", "Eptesicus_bottae",
    "Corethrella_spp", "P__pipistrellus", "Myotis", "Coleoptera,_Diptera,_Lepidoptera,_Erebidae,_Chironomidae,_Noctuidae_and_Psychodidae",
    "Pipistrellus"), ]

# match again
taxa <- tnrs_match_names(names = unique(dat$Species))
# check again whether all otts occur in the synthetic tree
ott_in_tree <- ott_id(taxa)[is_in_tree(ott_id(taxa))]
# length(ott_id(taxa)) - length(is.na(ott_in_tree)) # all good

# now every ott occur in the synthesistic tree. But for the sake of brevity, we
# only visualize a subset of the tree only use studies measuring physiology
Physiology <- subset(dat, Category == "Physiology")
# remove NA
Physiology <- subset(Physiology, Species != "NA")
taxa <- tnrs_match_names(names = unique(Physiology$Species))

# make phylo tree
tree <- suppressWarnings(tol_induced_subtree(ott_ids = ott_id(taxa)))
# the tip labels contain OTTs, which means they will not perfectly match the
# species names in our dataset or the taxon map that we created earlier, remove
# the extra information from the tip labels later; with the IDs removed, we can
# use our taxon map to replace the tip labels in the tree with the species
# names from dataset
tree$tip.label <- strip_ott_ids(tree$tip.label, remove_underscores = TRUE)

# decapitalise species names to match with the search string names in taxa
Physiology <- Physiology %>%
    mutate(search_string = tolower(Species))

# align data
Physiology <- left_join(Physiology, dplyr::select(taxa, search_string, unique_name,
    ott_id), by = "search_string")

# create the variables of spp and phylo
Physiology <- Physiology %>%
    mutate(spp = unique_name, phylo = unique_name)

# prepare annotation data species-specific estimates as fixed effect
agg.es <- escalc(measure = "SMD", m1i = Experimental_Mean, m2i = Control_Mean, sd1i = Experimental_SD,
    sd2i = Control_SD, n1i = Experimental_N, n2i = Control_N, data = Physiology) %>%
    aggregate(cluster = phylo, struct = "CS", rho = 0.5, addk = TRUE)
agg.es <- agg.es[c("phylo", "yi", "vi")]
names(agg.es) <- c("Species", "Mean", "SE")
agg.es <- agg.es %>%
    mutate(Lower_bound = Mean - sqrt(SE) * qnorm(0.975), Upper_bound = Mean + sqrt(SE) *
        qnorm(0.975)) %>%
    arrange(Species)
N_obs <- Physiology %>%
    group_by(phylo) %>%
    summarise(N_obs = n())  # calculate sample size
names(N_obs)[1] <- "Species"
fe.spp.es <- left_join(agg.es, N_obs, by = "Species")
tip.label <- data.frame(Species = tree$tip.label)  # extract tip label
fe.spp.es2 <- left_join(tip.label, fe.spp.es, by = "Species")
fe.spp.es2 <- fe.spp.es2 %>%
    mutate(z = Mean/SE, p = pnorm(abs(z), lower.tail = F) * 2)

# get class data
spp.class <- dplyr::distinct(Physiology, phylo, .keep_all = TRUE) %>%
    dplyr::select(Class, phylo)
names(spp.class)[2] <- "Species"
spp.class2 <- left_join(tip.label, spp.class, by = "Species")

# make each panel
tree.p1 <- ggtree(tree, layout = "rectangular", cex = 0.4)
tree.p2 <- tree.p1 %<+% spp.class2 + geom_tiplab(aes(color = Class), size = 3, fontface = "italic",
    align = T, offset = 0.3) + geom_tippoint(aes(color = Class)) + xlim_expand(c(0,
    21), panel = "Tree") + guides(color = "none") + scale_color_viridis_d()
tree.p3 <- tree.p2 + geom_facet(panel = "Effect size", data = fe.spp.es2, geom = ggstance::geom_pointrangeh,
    mapping = aes(x = Mean, xmin = Lower_bound, xmax = Upper_bound, color = Class)) +
    theme_tree2() + theme(strip.background = element_rect(fill = "white"))

# png(filename = 'Box1_phylo.png', width = 6, height = 5, units = 'in', type =
# 'windows', res = 400)
facet_widths(tree.p3, c(Tree = 0.4, `Effect size` = 0.2))

Enrichment 2: Bibliometric analysis

The second enrichment analysis is bibliometric anlaysis. It can produce two main graphs: (1) Co-authorship network (Figure IV), and (2) Country network (Figure V).

Co-authorship network

We use the bibliometric data associated with Data 3 to illustrate the construction of co-authorship network (Figure IV) in meta-analytic evidence base. The figure was inspired by Moulin, et al. (5), who originally implemented it using Matlab and VOSviewer.

Data 3

# load data
dat <- read_xlsx(here("Data", "Sanders_2021.xlsx"))
# load bibliographic data
M <- convert2df(here("Data", "Sanders_2021_bib.csv"), dbsource = "scopus", format = "csv") %>%
    suppressMessages()  # note that using here function does work for data wrangling
## 
## Converting your scopus collection into a bibliographic dataframe
## 
## Done!
## 
## 
## Generating affiliation field tag AU_UN from C1:  Done!
# merge data
dat.M <- left_join(M, dat, by = c(DI = "DOI"))  # DI denote DOI in bib data

Visualization

Figure IV can be made with:

# construct author collaboration network
NetMatrix <- biblioNetwork(M, analysis = "collaboration",  network = "authors", sep = ";")
net_matrix <- as.matrix(NetMatrix)
diag(net_matrix) <- 0 
g <- graph_from_adjacency_matrix(net_matrix, mode = "lower", weighted = "weight")
# computing centrality measures for each vertex 
V(g)$indegree <-  igraph::degree(g, mode = "in")
V(g)$outdegree <- igraph::degree(g, mode = "out")
V(g)$closeness <- igraph::closeness(g, mode = "total")
V(g)$betweeness <- igraph::betweenness(g, normalized = TRUE)

#  graph
set.seed(2023)
# using random walks to detect the network community
wtc <- cluster_walktrap(g)
# modularity(wtc)
# modularity(g, membership(wtc))
# member <- communities(wtc)
# n_cluster <- sapply(member, length)

#png(filename = "./Box2_author.png", width = 15, height = 15, units = "in", type = "windows", res = 400)
plot(wtc, g, 
     vertex.size = V(g)$indegree/3, # or V(g)$outdegree + 1
     vertex.label = NA,
     edge.arrow.size = .25)

Besides the visualization, we also can do some quantitative analysis:

# extracting each vectex features as a data.frame
stats <- as_data_frame(g, what = "vertices")

# computing quantiles for each vertex attributes
stats_degree <- with(stats, {
    cbind(indegree = quantile(indegree, c(0.025, 0.5, 0.975), na.rm = TRUE), outdegree = quantile(outdegree,
        c(0.025, 0.5, 0.975), na.rm = TRUE), closeness = quantile(closeness, c(0.025,
        0.5, 0.975), na.rm = TRUE), betweeness = quantile(betweeness, c(0.025, 0.5,
        0.975), na.rm = TRUE))
})

# compute some statistics at the graph level:
cbind(size = vcount(g), nedges = ecount(g), density = edge_density(g), recip = reciprocity(g),
    centr = centr_betw(g)$centralization, pathLen = mean_distance(g))
##      size nedges    density recip      centr  pathLen
## [1,]  576   1739 0.01050121     1 0.01110043 2.834336

Country network

The same bibliometric data can be to illustrate the construction of country network (Figure V) in meta-analytic evidence base.

Data 3

# country
M_country <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M_country, analysis = "coupling", network = "countries",
    sep = ";")

net_matrix <- as.matrix(NetMatrix)
# get rid of collaboration with same country
diag(net_matrix) <- 0
# getting rid of lower triangle
net_matrix[lower.tri(net_matrix)] <- 0

# colnames(net_matrix) - change to title case:
colnames(net_matrix) <- str_to_title(colnames(net_matrix))
# rownames(net_matrix) - change to title case:
rownames(net_matrix) <- str_to_title(rownames(net_matrix))
# Fix 'Usa' to 'United States' :
colnames(net_matrix)[colnames(net_matrix) == "Usa"] <- "United States"
rownames(net_matrix)[rownames(net_matrix) == "Usa"] <- "United States"

Visualization

Figure V can be made with:

# color palette
color <- viridis::viridis(34, alpha = 1, option = "D")
color <- color[sample(1:34)] 

circos.clear()
circos.par(start.degree = 90, 
           gap.degree = 3,  
           points.overflow.warning = FALSE)

#png(filename = "Box2_country.png", width = 5, height = 5, units = "in", type = "windows", res = 400)
chordDiagram(net_matrix, 
             grid.col = color, transparency = 0.1, 
             directional = 1,direction.type = c("arrows", "diffHeight"), diffHeight  = -0.04, annotationTrack = "grid", annotationTrackHeight = c(0.05, 0.1),link.arr.type = "big.arrow",link.sort = TRUE, link.largest.ontop = TRUE,
             preAllocateTracks = 1)
# add text and axis
circos.trackPlotRegion(track.index = 1, panel.fun = function(x, y) {
  xlim = get.cell.meta.data("xlim")
  ylim = get.cell.meta.data("ylim")
  sector.index = get.cell.meta.data("sector.index")
  # add names to the sector. 
  circos.text(x = mean(xlim), y = ylim[1] + .01, 
              labels = sector.index, facing = "clockwise", # clockwise
              niceFacing = TRUE, cex = 0.6, adj = c(0, 0.5))
  # add graduation on axis
  circos.axis(h = "top", labels.cex = 0.01, major.tick.length = 0.01, sector.index = sector.index, track.index = 2, labels.niceFacing = T)
}, bg.border = NA)

Enrichment 3: Altmetric analyses

We used Data 4 in the main text to illustrate how to visualize the alternative impact metrics (Figure VI). Data 4 was originally used to examine the replicability of the preclinical cancer biology (6).

Data 4

# load meta-analytic data
dat <- suppressMessages(read_csv(here("Data", "Timothy_2021.csv")))
# use a custom function to extract data using API because it takes a while to
# extract data using API, we just upload the pre-extracted data.
altmetrics <- read.csv(here("Data", "altmetrics.csv"))
#-------------------not run-------------------#
# altmetric.crawler <- list(NULL) for (n in 1:length(dat$DOI)) { # JASON format
# altmetric.crawler[[n]] <- try(list(format.Altmetric(getAltmetrics(doi =
# dat$DOI[n]))),silent=TRUE) }

# get lists within lists altmetric.crawler2 <- sapply(altmetric.crawler,
# function(x) {x}) retrieve stats altmetrics <-
# altmetric_summary(altmetric.crawler2) save(data, file = 'data.Rdata')
# write.csv(altmetrics,file = 'Data/altmetrics.csv')
#-------------------not run-------------------#
# show data in a datatable
t4 <- altmetrics %>%
    DT::datatable()
t4

Visualization

Figure VI can be made with:

altmetrics2 <- altmetrics %>%
    distinct(paper, .keep_all = TRUE)
altmetrics2 <- altmetrics2 %>%
    mutate(count = Policy + Patent)
altmetrics2 <- altmetrics2 %>%
    mutate(group = rep("", nrow(altmetrics2)))

# png(filename = 'Box3_altmetric1.png', width = 5, height = 3, units = 'in',
# type = 'windows', res = 400)
Box3_altmetric1 <- ggplot2::ggplot() + ggbeeswarm::geom_quasirandom(data = altmetrics2,
    ggplot2::aes(y = Altmetric.score, x = group, size = count), fill = "#1B9E77",
    col = "#999999", alpha = 0.8, shape = 21) + ylim(0, 300) + ggplot2::coord_flip() +
    ggplot2::theme_bw() + ggplot2::guides(fill = "none", colour = "none") + ggplot2::theme(legend.position = c(0,
    1), legend.justification = c(0, 1)) + ggplot2::theme(legend.title = ggplot2::element_text(size = 9)) +
    ggplot2::theme(legend.direction = "horizontal") + ggplot2::theme(legend.background = ggplot2::element_blank()) +
    ggplot2::labs(x = "Social media interest", y = "Altmetric score", size = latex2exp::TeX("Policy and patent citations")) +
    # ggplot2::theme(axis.ticks.y = element_blank()) +
ggplot2::theme(axis.text.y = ggplot2::element_text(size = 10, colour = "black", hjust = 0.5,
    angle = 0))

# add altmetrics and transnational info to an evidence map load meta-analytic
# data
dat <- suppressMessages(read_csv(here("Data", "Timothy_2021.csv")))
altmetrics <- read.csv(here("Data", "altmetrics.csv"))

# add altemtrics to the original data
dat <- dat %>%
    mutate(Altmetric.score = altmetrics$Altmetric.score, policy = altmetrics$Policy,
        patent = altmetrics$Patent)

dat <- dat[, c("Original paper journal", "Replication study fully completed", "Altmetric.score",
    "Paper #")]
names(dat) <- c("dimension1", "dimension2", "score", "study_id")
dat$score <- round(dat$score, 0)

# delete the cases where the replications have not be conducted due to the lack
# of experimental protocol details
dat <- dat[!dat$dimension2 == "Not applicable", ]

# run altmetric_aggregate function through each combination
est_dat <- dat %>%
    group_by(dimension1, dimension2) %>%
    group_modify(~altmetric_aggregate(.x)) %>%
    ungroup()

est_dat1 <- est_dat

# policy and patent counts
dat <- suppressMessages(read_csv(here("Data", "Timothy_2021.csv")))

# add altemtrics to the original data
dat <- dat %>%
    mutate(Altmetric.score = altmetrics$Altmetric.score, policy = altmetrics$Policy,
        patent = altmetrics$Patent)
dat <- dat %>%
    mutate(count = patent + policy)

dat <- dat[, c("Original paper journal", "Replication study fully completed", "count",
    "Paper #")]
names(dat) <- c("dimension1", "dimension2", "count", "study_id")
dat$count <- round(dat$count, 0)

# delete the cases where the replications have not be conducted due to the lack
# of experimental protocol details
dat <- dat[!dat$dimension2 == "Not applicable", ]

# run translation_aggregate function through each cell
est_dat <- dat %>%
    group_by(dimension1, dimension2) %>%
    group_modify(~translation_aggregate(.x)) %>%
    ungroup()

est_dat2 <- est_dat

colnames(est_dat2)[colnames(est_dat2) == "estimate"] <- "count"
est_dat2 <- est_dat2[, 1:3]

# combine
est_dat <- merge(est_dat1, est_dat2, by = c("dimension1", "dimension2"))
est_dat$count <- round(est_dat$count, 1)

est_dat <- est_dat %>%
    mutate(dimension2 = case_when(dimension2 == "No replication" ~ "Incomplete replication",
        dimension2 == "Full replication" ~ "Full replication", dimension2 == "Partial replication" ~
            "Partial replication"))


# png(filename = 'Box3_altmetric2.png', width = 6, height = 6, units = 'in',
# type = 'windows', res = 400)
Box3_altmetric2 <- ggplot(est_dat, aes(x = dimension1, y = dimension2, size = count,
    color = estimate)) + geom_point(alpha = 0.6) + scale_color_gradient(low = "#E6AB02",
    high = "purple", limits = c(0, 400), guide = "colourbar") + labs(x = "Dimension 1 (Journal)",
    y = " Dimension 2 (Replication completion)", color = "Altmetric score") + scale_size_identity() +
    scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + scale_y_discrete(labels = function(x) str_wrap(x,
    width = 10)) + theme_bw() + guides(size = "none") + theme(legend.position = "top",
    legend.justification = "right", legend.direction = "horizontal", axis.text = element_text(color = "black"),
    axis.title = element_text(color = "black")) + geom_text(aes(label = as.character(count)),
    size = 2.5, color = "gray10", fontface = "bold") + labs(caption = "The value in the cell is the citation count of policies and patents") +
    theme(plot.caption = element_text(size = 8, color = "gray10", face = "italic"))

# png(filename = 'Box3_altmetric.png', width = 5, height = 7, units = 'in',
# type = 'windows', res = 400)
plot_grid(Box3_altmetric1, Box3_altmetric2, labels = c("A", "B"), label_size = 14,
    nrow = 2, ncol = 1, rel_heights = c(1, 2))

License

This documented is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International.

Software and package versions

sessionInfo() %>%
    pander()

R version 4.0.3 (2020-10-10)

Platform: x86_64-w64-mingw32/x64 (64-bit)

locale: _LC_COLLATE=Chinese (Simplified)China.936, _LC_CTYPE=Chinese (Simplified)China.936, _LC_MONETARY=Chinese (Simplified)China.936, LC_NUMERIC=C and _LC_TIME=Chinese (Simplified)China.936

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: igraph(v.1.3.0), circlize(v.0.4.15), bibliometrix(v.4.0.2), flextable(v.0.8.6), ggtree(v.3.7.1.001), ggstance(v.0.3.5), ape(v.5.6-1), cowplot(v.1.1.1), rotl(v.3.0.11), formatR(v.1.11), pander(v.0.6.4), ggthemr(v.1.1.0), viridis(v.0.6.2), viridisLite(v.0.4.0), car(v.3.0-11), carData(v.3.0-4), lme4(v.1.1-26), metafor(v.4.2-0), numDeriv(v.2016.8-1.1), metadat(v.1.2-0), Matrix(v.1.5-3), readxl(v.1.3.1), ggpubr(v.0.4.0), DT(v.0.19), here(v.1.0.1), forcats(v.0.5.2), stringr(v.1.5.0), dplyr(v.1.0.10), purrr(v.0.3.4), readr(v.2.1.2), tidyr(v.1.2.1), tibble(v.3.1.8), ggplot2(v.3.4.4), tidyverse(v.1.3.1), rmdformats(v.1.0.3) and knitr(v.1.37)

loaded via a namespace (and not attached): pacman(v.0.5.1), utf8(v.1.2.2), tidyselect(v.1.2.0), htmlwidgets(v.1.5.3), FactoMineR(v.2.6), grid(v.4.0.3), clubSandwich(v.0.5.3), munsell(v.0.5.0), codetools(v.0.2-18), ragg(v.1.2.5), statmod(v.1.4.35), rentrez(v.1.2.3), withr(v.2.5.0), colorspace(v.2.0-0), highr(v.0.9), uuid(v.0.1-4), rstudioapi(v.0.13), leaps(v.3.1), ggsignif(v.0.6.3), officer(v.0.6.0), fontLiberation(v.0.1.0), labeling(v.0.4.2), emmeans(v.1.6.3), bit64(v.4.0.5), farver(v.2.1.0), rprojroot(v.2.0.2), coda(v.0.19-4), vctrs(v.0.5.0), treeio(v.1.14.4), generics(v.0.1.0), TH.data(v.1.1-0), xfun(v.0.29), fontquiver(v.0.2.1), R6(v.2.5.1), ggbeeswarm(v.0.6.0), cachem(v.1.0.6), gridGraphics(v.0.5-1), assertthat(v.0.2.1), vroom(v.1.5.7), promises(v.1.2.0.1), scales(v.1.2.1), multcomp(v.1.4-17), beeswarm(v.0.4.0), gtable(v.0.3.0), multcompView(v.0.1-8), sandwich(v.3.0-1), rlang(v.1.1.1), systemfonts(v.1.0.4), scatterplot3d(v.0.3-41), dimensionsR(v.0.0.3), GlobalOptions(v.0.1.2), splines(v.4.0.3), rstatix(v.0.7.0), lazyeval(v.0.2.2), broom(v.1.0.1), yaml(v.2.2.1), abind(v.1.4-5), modelr(v.0.1.8), tidytext(v.0.3.4), crosstalk(v.1.1.1), backports(v.1.2.1), httpuv(v.1.6.2), tokenizers(v.0.2.1), tools(v.4.0.3), bookdown(v.0.24), pubmedR(v.0.0.3), ggplotify(v.0.0.9), ellipsis(v.0.3.2), jquerylib(v.0.1.4), RColorBrewer(v.1.1-3), latex2exp(v.0.9.4), plyr(v.1.8.6), Rcpp(v.1.0.8.3), rscopus(v.0.6.6), progress(v.1.2.2), prettyunits(v.1.1.1), openssl(v.1.4.4), zoo(v.1.8-9), haven(v.2.4.3), ggrepel(v.0.9.1), cluster(v.2.1.0), factoextra(v.1.0.7), fs(v.1.5.2), crul(v.1.3), magrittr(v.2.0.3), data.table(v.1.14.0), openxlsx(v.4.2.4), reprex(v.2.0.1), mvtnorm(v.1.1-3), SnowballC(v.0.7.0), bibliometrixData(v.0.3.0), hms(v.1.1.0), patchwork(v.1.1.1), mime(v.0.11), evaluate(v.0.14), xtable(v.1.8-4), XML(v.3.99-0.7), rio(v.0.5.29), shape(v.1.4.6), gridExtra(v.2.3), compiler(v.4.0.3), fontBitstreamVera(v.0.1.1), crayon(v.1.4.1), minqa(v.1.2.4), htmltools(v.0.5.2), ggfun(v.0.0.8), later(v.1.3.0), tzdb(v.0.1.2), aplot(v.0.1.8), lubridate(v.1.7.10), DBI(v.1.1.1), dbplyr(v.2.1.1), MASS(v.7.3-54), boot(v.1.3-28), cli(v.3.4.1), parallel(v.4.0.3), pkgconfig(v.2.0.3), flashClust(v.1.01-2), rncl(v.0.8.4), foreign(v.0.8-81), plotly(v.4.9.4.1), xml2(v.1.3.2), vipor(v.0.4.5), bslib(v.0.3.0), stringdist(v.0.9.7), estimability(v.1.3), rvest(v.1.0.1), yulab.utils(v.0.0.5), janeaustenr(v.1.0.0), digest(v.0.6.27), httpcode(v.0.3.0), rmarkdown(v.2.11), cellranger(v.1.1.0), tidytree(v.0.4.1), gdtools(v.0.3.1), curl(v.4.3.2), shiny(v.1.6.0), nloptr(v.1.2.2.2), lifecycle(v.1.0.3), nlme(v.3.1-151), jsonlite(v.1.7.2), askpass(v.1.1), fansi(v.0.5.0), pillar(v.1.8.1), ggsci(v.2.9), lattice(v.0.20-41), fastmap(v.1.1.0), httr(v.1.4.2), survival(v.3.2-7), glue(v.1.6.2), zip(v.2.2.0), bit(v.4.0.4), stringi(v.1.7.4), sass(v.0.4.0), textshaping(v.0.3.6), gfonts(v.0.2.0), memoise(v.2.0.0) and mathjaxr(v.1.2-0)

References

1.
A. Hodkinson, P. Bower, C. Grigoroglou, S. S. Zghebi, H. Pinnock, E. Kontopantelis, M. Panagioti, Self-management interventions to reduce healthcare use and improve quality of life among patients with asthma: Systematic review and network meta-analysis. BMj 370 (2020).
2.
J. R. Polanin, Q. Zhang, J. A. Taylor, R. T. Williams, M. Joshi, L. Burr, Evidence gap maps in education research. Journal of Research on Educational Effectiveness 16, 532–552 (2023).
3.
S. Mertens, M. Herberz, U. J. Hahnel, T. Brosch, The effectiveness of nudging: A meta-analysis of choice architecture interventions across behavioral domains. Proceedings of the National Academy of Sciences 119, e2107346118 (2022).
4.
D. Sanders, E. Frago, R. Kehoe, C. Patterson, K. J. Gaston, A meta-analysis of biological impacts of artificial light at night. Nature Ecology & Evolution 5, 74–81 (2021).
5.
T. C. Moulin, O. B. Amaral, Using collaboration networks to identify authorship dependence in meta-analysis results. Research Synthesis Methods 11, 655–668 (2020).
6.
T. M. Errington, M. Mathur, C. K. Soderberg, A. Denis, N. Perfito, E. Iorns, B. A. Nosek, Investigating the replicability of preclinical cancer biology. Elife 10, e71601 (2021).